Smart Paper Notes

Efficiently Learning Recoveries from Failures Under Partial Observability

Shivam Vats , Maxim Likhachev , Oliver Kroemer

Categories: Robotics | Artificial Intelligence | Machine Learning

2022-09-27

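Operating under real-world conditions is challenging because partial observability induces a wide range of failures. In relatively benign settings, such failures can be overcome by retrying or by executing one of a small number of hand-engineered recovery strategies. By contrast, contact-rich sequential manipulation tasks, such as opening doors and assembling furniture, are not amenable to exhaustive hand-engineering. To address this, we present a general approach for robustifying manipulation strategies in a sample-efficient manner. Our approach improves robustness by first discovering the failure modes of the current strategy through exploration in simulation, and then learning additional recovery skills to handle these failures. To ensure efficient learning, we propose an online algorithm, Value Upper Confidence Limit (Value-UCL), that selects which failure modes to prioritize and which state to recover to, so that expected performance improves maximally in every training episode. We use our approach to learn recovery skills for door opening and evaluate them both in simulation and on a real robot. Compared to open-loop execution, our experiments show that even a limited amount of recovery learning improves task success from 71% to 92.4% in simulation and from 75% to 90% on a real robot.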
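
The selection step described above resembles a multi-armed bandit problem. The following is a minimal sketch using the generic UCB1 rule, not the authors' Value-UCL algorithm, whose details the abstract does not give. The failure-mode names, the `stats` layout, and the exploration constant `c` are all illustrative assumptions.

```python
import math


def select_failure_mode(stats, total_episodes, c=1.0):
    """Pick the failure mode with the highest optimistic gain estimate.

    stats maps a (hypothetical) failure-mode name to a tuple
    (episodes_spent, mean_observed_gain). This is plain UCB1,
    used here only as a stand-in for the paper's Value-UCL.
    """
    best_mode, best_score = None, -math.inf
    for mode, (n, mean_gain) in stats.items():
        if n == 0:
            return mode  # train on every mode once before comparing
        # Optimism bonus shrinks as a mode receives more training episodes.
        bonus = c * math.sqrt(2.0 * math.log(total_episodes) / n)
        score = mean_gain + bonus
        if score > best_score:
            best_mode, best_score = mode, score
    return best_mode


# Illustrative numbers only; they are not results from the paper.
stats = {"handle_slip": (10, 0.02), "missed_grasp": (2, 0.05)}
chosen = select_failure_mode(stats, total_episodes=12)
```

With these made-up numbers the rarely trained mode wins because its uncertainty bonus dominates, which is the behaviour an upper-confidence rule is meant to produce.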

Translated by Google Translate

Synergistic Scheduling of Learning and Allocation of Tasks in Human-Robot Teams

Shivam Vats , Oliver Kroemer , Maxim Likhachev

Categories: Robotics

2022-03-14

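We consider the problem of completing a set of $n$ tasks with a human-robot team using minimum effort. In many domains, teaching a robot to be fully autonomous can be counterproductive if there are only finitely many tasks to be done. Rather, the optimal strategy is to weigh the cost of teaching the robot against its benefit: how many new tasks it allows the robot to solve autonomously. We formulate this as a planning problem whose goal is to decide which tasks the robot should perform autonomously (act), which tasks should be delegated to a human (delegate), and which tasks the robot should be taught (learn), so as to complete all the given tasks with minimum effort. This planning problem yields a search tree that grows exponentially with $n$, making standard graph search algorithms intractable. We address this by converting the problem into a mixed integer program that can be solved efficiently with off-the-shelf solvers, with bounds on solution quality. To predict the benefit of learning, we propose a precondition prediction classifier. Given two tasks, this classifier predicts whether a skill trained on one will transfer to the other. Finally, we evaluate our approach on peg insertion and Lego stacking tasks, both in simulation and in the real world, showing substantial savings in human effort.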
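
To make the act/delegate/learn trade-off concrete, here is a small sketch that enumerates every labelling of a toy instance. It is deliberately the exponential brute-force search that the abstract says does not scale, not the paper's mixed integer program. The cost values, the `transfers` structure, and the simplification that task ordering is ignored are all assumptions for illustration.

```python
from itertools import product


def plan(n, teach_cost, delegate_cost, act_cost, transfers):
    """Exhaustively search act/delegate/learn labels for n tasks.

    transfers[i] is the set of tasks that a skill taught on task i is
    assumed to solve (a stand-in for a transfer classifier's output).
    Teaching a task is assumed to complete it as well.
    Returns (minimum_total_effort, labels).
    """
    cost_of = {"act": act_cost, "delegate": delegate_cost, "learn": teach_cost}
    best = (float("inf"), None)
    for labels in product(("act", "delegate", "learn"), repeat=n):
        taught = [i for i, label in enumerate(labels) if label == "learn"]
        covered = set().union(*(transfers[i] for i in taught))
        # The robot may only act on tasks covered by some taught skill.
        if any(label == "act" and j not in covered
               for j, label in enumerate(labels)):
            continue
        cost = sum(cost_of[label] for label in labels)
        if cost < best[0]:
            best = (cost, labels)
    return best


# Toy instance: teaching task 0 is assumed to transfer to tasks 1, 2 and 3.
transfers = [{1, 2, 3}, set(), set(), set()]
effort, labels = plan(4, teach_cost=5, delegate_cost=3, act_cost=1,
                      transfers=transfers)
```

In this toy instance, teaching once and acting three times costs 8, against 12 for delegating everything. The loop visits 3^n labellings, which is why a solver-based formulation is needed for larger $n$.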

Quantum-Inspired Tensor Neural Networks for Option Pricing

Raj G. Patel , Chia-Wei Hsing , Serkan Sahin , Samuel Palmer , Saeed S. Jahromi , Shivam Sharma , Tomas Dominguez , Kris Tziritas , Christophe Michel , Vincent Porte

Categories: Machine Learning

2022-12-28

Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of these approaches to addressing the COD has led to methods for solving high-dimensional PDEs. This has opened the door to solving a variety of real-world problems, ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. Tackling these shortcomings, we demonstrate that Tensor Neural Networks (TNN) can provide significant parameter savings while attaining the same accuracy as the classical Dense Neural Network (DNN). In addition, we show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we also introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance for an equivalent parameter count as compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.

This changes to that : Combining causal and non-causal explanations to generate disease progression in capsule endoscopy

Anuja Vats , Ahmed Mohammed , Marius Pedersen , Nirmalie Wiratunga

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-12-05

Due to the unequivocal need for understanding the decision processes of deep learning networks, both model-dependent and model-agnostic techniques have become very popular. Although both of these ideas provide transparency for automated decision making, most methodologies focus on either using the model gradients (model-dependent) or ignoring the model's internal states and reasoning about the model's behavior/outcome on instances (model-agnostic). In this work, we propose a unified explanation approach that, given an instance, combines both model-dependent and model-agnostic explanations to produce an explanation set. The generated explanations are not only consistent in the neighborhood of a sample but can also highlight causal relationships between image content and the outcome. We use the Wireless Capsule Endoscopy (WCE) domain to illustrate the effectiveness of our explanations. The saliency maps generated by our approach are comparable or better on the softmax information score.

Design of an All-Purpose Terrace Farming Robot

Vibhakar Mohta , Adarsh Patnaik , Shivam Kumar Panda , Siva Vignesh Krishnan , Abhinav Gupta , Abhay Shukla , Gauri Wadhwa , Shrey Verma , Aditya Bandopadhyay

Categories: Robotics

2022-12-04

Automation in farming processes is a growing field of research in both academia and industries. A considerable amount of work has been put into this field to develop systems robust enough for farming. Terrace farming, in particular, provides a varying set of challenges, including robust stair climbing methods and stable navigation in unstructured terrains. We propose the design of a novel autonomous terrace farming robot, Aarohi, that can effectively climb steep terraces of considerable heights and execute several farming operations. The design optimisation strategy for the overall mechanical structure is elucidated. Further, the embedded and software architecture along with fail-safe strategies are presented for a working prototype. Algorithms for autonomous traversal over the terrace steps using the scissor lift mechanism and performing various farming operations have also been discussed. The adaptability of the design to specific operational requirements and modular farm tools allow Aarohi to be customised for a wide variety of use cases.

What do you MEME? Generating Explanations for Visual Semantic Role Labelling in Memes

Shivam Sharma , Siddhant Agarwal , Tharun Suresh , Preslav Nakov , Md. Shad Akhtar , Tanmoy Chakraborty

Categories: Natural Language Processing

2022-12-01

Memes are powerful means for effective communication on social media. Their effortless amalgamation of viral visuals and compelling messages can have far-reaching implications with proper marketing. Previous research on memes has primarily focused on characterizing their affective spectrum and detecting whether the meme's message insinuates any intended harm, such as hate, offense, racism, etc. However, memes often use abstraction, which can be elusive. Here, we introduce a novel task - EXCLAIM, generating explanations for visual semantic role labeling in memes. To this end, we curate ExHVV, a novel dataset that offers natural language explanations of connotative roles for three types of entities - heroes, villains, and victims, encompassing 4,680 entities present in 3K memes. We also benchmark ExHVV with several strong unimodal and multimodal baselines. Moreover, we posit LUMEN, a novel multimodal, multi-task learning framework that endeavors to address EXCLAIM optimally by jointly learning to predict the correct semantic roles and correspondingly to generate suitable natural language explanations. LUMEN distinctly outperforms the best baseline across 18 standard natural language generation evaluation metrics. Our systematic evaluation and analyses demonstrate that characteristic multimodal cues required for adjudicating semantic roles are also helpful for generating suitable explanations.

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

Benjamin Kiefer , Matej Kristan , Janez Perš , Lojze Žust , Fabio Poiesi , Fabio Augusto de Alcantara Andrade , Alexandre Bernardino , Matthew Dawkins , Jenni Raitoharju , Yitong Quan

Categories: Computer Vision | Artificial Intelligence | Machine Learning | Robotics

2022-11-24

The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.

Domain-aware Self-supervised Pre-training for Label-Efficient Meme Analysis

Shivam Sharma , Mohd Khizir Siddiqui , Md. Shad Akhtar , Tanmoy Chakraborty

Categories: Natural Language Processing | Artificial Intelligence

2022-09-29

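Existing self-supervised learning strategies are constrained to either a limited set of objectives or generic downstream tasks that predominantly target unimodal applications. This has isolated progress for imperative multimodal applications that are diverse in complexity and domain affinity, such as meme analysis. Here, we introduce two self-supervised pre-training methods, Ext-PIE-Net and MM-SimCLR, that (i) employ off-the-shelf multimodal hate-speech data during pre-training and (ii) perform self-supervised learning by incorporating multiple specialized pretext tasks, effectively catering to the complex multimodal representation learning required for meme analysis. We experiment with different self-supervision strategies, including potential variants that could help learn rich cross-modal representations, and evaluate them using the popular linear probing protocol on the Hateful Memes task. The proposed solutions compete strongly with the fully supervised baseline via label-efficient training, while distinctly outperforming it on all three tasks of the Memotion challenge, with performance gains of 0.18%, 23.64%, and 0.93%, respectively. Further, we demonstrate the generalizability of the proposed solutions by reporting competitive performance on the HarMeme task. Finally, we empirically establish the quality of the learned representations by analyzing task-specific learning with fewer labeled training samples, and argue that the complexity of the self-supervision strategy and that of the downstream task at hand are correlated. Our efforts highlight the need for better multimodal self-supervision methods that involve specialized pretext tasks for efficient fine-tuning and generalizable performance.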

Multiple Waypoint Navigation in Unknown Indoor Environments

Shivam Sood , Jaskaran Singh Sodhi , Parv Maheshwari , Karan Uppal , Debashish Chakravarty

Categories: Robotics

2022-09-18

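Indoor motion planning focuses on the problem of navigating an agent through a cluttered environment. A great deal of work has been done in this field to date, but these methods often fail to find the right balance between computationally inexpensive online path planning and path optimality. In addition, these works often prove optimality only for single-start, single-goal worlds. To address these challenges, we present a multiple-waypoint path planner and controller stack for navigation in unknown indoor environments, where the waypoints include the goal together with the intermediate points that the robot must traverse before reaching the goal. Our approach uses a global planner (to find the next best waypoint at any instant), a local planner (to plan the path to a specific waypoint), and an adaptive Model Predictive Control strategy (for robust system control and faster maneuvers). We evaluate our algorithm on a set of randomly generated obstacle maps, intermediate waypoints, and start-goal pairs; the results indicate a significant reduction in computational cost, with high accuracy and robust control.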
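
The abstract does not say how the global planner scores the "next best" waypoint. As a rough sketch under the assumption that "best" means nearest in straight-line distance (which ignores obstacles and is not taken from the paper), a greedy ordering of intermediate waypoints could look like this; the function name and the coordinates are hypothetical.

```python
import math


def order_waypoints(start, intermediates, goal):
    """Greedily visit the nearest remaining waypoint, ending at the goal.

    Points are (x, y) tuples. Straight-line distance is an assumed
    stand-in for whatever cost the actual global planner uses.
    """
    position, remaining, route = start, list(intermediates), []
    while remaining:
        nearest = min(remaining, key=lambda w: math.dist(position, w))
        remaining.remove(nearest)
        route.append(nearest)
        position = nearest
    route.append(goal)  # the goal is always visited last
    return route


route = order_waypoints((0, 0), [(5, 5), (1, 0), (2, 2)], (6, 6))
```

In a full stack, each consecutive pair in `route` would be handed to a local planner, and the ordering would be recomputed as the map of the unknown environment is updated.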

The SZ flux-mass ($Y$-$M$) relation at low halo masses: improvements with symbolic regression and strong constraints on baryonic feedback

Digvijay Wadekar , Leander Thiele , J. Colin Hill , Shivam Pandey , Francisco Villaescusa-Navarro , David N. Spergel , Miles Cranmer , Daisuke Nagai , Daniel Anglés-Alcázar , Shirley Ho

Categories: Artificial Intelligence | Machine Learning

2022-09-05

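Ionized gas in the halo circumgalactic medium leaves an imprint on the cosmic microwave background via the thermal Sunyaev-Zeldovich (tSZ) effect. Feedback from active galactic nuclei (AGN) and supernovae can affect measurements of the integrated tSZ flux of halos ($Y_\mathrm{SZ}$) and cause its relation with halo mass ($Y_\mathrm{SZ}$-$M$) to deviate from the self-similar power-law prediction of the virial theorem. We perform a comprehensive study of such deviations using CAMELS, a suite of hydrodynamic simulations with extensive variations in feedback prescriptions. We use a combination of two machine learning tools (random forest and symbolic regression) to search for analogues of the $Y$-$M$ relation that are more robust to feedback processes at low masses ($M \lesssim 10^{14}\, h^{-1}\, M_\odot$); we find that simply replacing $Y \rightarrow Y(1+M_*/M_\mathrm{gas})$ in the relation makes it remarkably self-similar. This could serve as a robust multiwavelength mass proxy for low-mass clusters and galaxy groups. Our methodology can also be generally useful for extending the domain of validity of other astrophysical scaling relations. We also forecast that measurements of the $Y$-$M$ relation could provide percent-level constraints on certain combinations of feedback parameters and/or rule out a major part of the parameter space of the supernova and AGN feedback models used in current state-of-the-art hydrodynamic simulations. Our results can be useful for constraining the nature of baryonic feedback with upcoming SZ surveys (e.g., SO, CMB-S4) and galaxy surveys (e.g., DESI and Rubin). Finally, we find that an alternative relation, $Y$-$M_*$, provides information on feedback that is complementary to $Y$-$M$.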
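
For reference, the substitution can be written next to the self-similar prediction. The exponent $5/3$ is the standard virial-theorem scaling at fixed redshift; it is supplied here as background and is not quoted from the abstract. The second line is one reading of "remarkably self-similar", namely that the corrected quantity follows the same power law.

```latex
\begin{align}
  Y_\mathrm{SZ} &\propto M^{5/3}
    && \text{(self-similar prediction)} \\
  Y_\mathrm{SZ}\left(1 + \frac{M_*}{M_\mathrm{gas}}\right) &\propto M^{5/3}
    && \text{(after the substitution, for } M \lesssim 10^{14}\, h^{-1} M_\odot\text{)}
\end{align}
```

The correction factor is larger for halos in which feedback has left little gas relative to stars, which are the low-mass systems where $Y_\mathrm{SZ}$ is said to fall below the self-similar line.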
